National College Students’ Innovation and Entrepreneurship Program — Development of an Intelligent Aesthetic Education System Based on Large Language Models and Image Generation Models.

Type: project, Nankai University, 2023

To develop teenagers' comprehensive abilities, including intelligence, emotion, willpower, and aesthetic sense, and to support their all-round development, we intend to build an intelligent aesthetic education virtual platform based on large language models (LLMs) and image generation models (such as Stable Diffusion). The platform will provide virtual reality models of famous artists from ancient and modern times, so that users can interact with these artists, learn about their history, appreciate their works, and improve their aesthetic taste.

Project Rationale and Research Content

China introduced the “Double Reduction” policy to alleviate excessive academic pressure and anxiety among primary and secondary school students and to advance quality education, giving students more time to explore their interests and hobbies. This policy, which encompasses a series of educational reform measures, includes “eliminating homework for primary and secondary school students, reducing their academic burden,” and “reducing the burden without compromising on quality, promoting quality education.” Under the “Double Reduction” policy, aesthetic education becomes particularly important. The “National Medium and Long-term Education Reform and Development Plan Outline (2010-2020)” calls for the comprehensive development of education, emphasizing a healthier educational environment and deeper cultural and aesthetic education to promote the holistic and harmonious development of students.

We have observed that aesthetic education for adolescents in China is currently limited to classroom lectures and lacks interactive, conversational, and experiential products that let students genuinely appreciate the beauty of art. Existing products have various shortcomings: some only introduce artists without letting users ask follow-up questions, while others only display artists’ original works without letting users experience the artist’s style for themselves. To develop the intelligence, emotions, will, aesthetic ability, and other comprehensive capabilities of adolescents, and to support their holistic development, we intend to develop an intelligent aesthetic education virtual platform based on large language models (LLMs) and image generation models (such as Stable Diffusion). This platform will provide virtual reality models of famous artists from the past and present, allowing users to interact with these artists, understand their history, and appreciate their works to enhance their aesthetic taste.

On this platform, users can learn in detail about the creation history and motivation behind each famous artwork. They can put their own questions to the virtual artists and converse with them freely. Users can also upload their own paintings and ask an artist to rework them in that artist’s style. They can even enter a text description and have the artist create a piece that meets their requirements in the artist’s style.

Because of limitations in their training corpora and other factors, current large language models often cannot provide accurate information about artworks. Image generation models likewise require additional training and adjustment to achieve the desired effects.

In summary, this project is forward-looking and feasible, and it contributes to the reform and development of China’s education system. Its core is the development of virtual artist models with which users can interact freely, an aspect that is both innovative and technically exploratory.

Research Content, Objectives, and Technical Approach of the Project

The research content and objectives of the project are divided into the following aspects:

  1. Train a large language model capable of providing information about artists, answering user queries, and conversing with users, by modifying algorithms, adding some training data, and employing few-shot learning methods.

  2. Develop an image generation model capable of presenting artworks, modifying users’ works, and producing images in an artistic style that match a given textual description, by modifying algorithms, adding some training data, and employing few-shot learning methods.

  3. Fine-tune the Stable Diffusion base model to generate images in various artistic styles. Use the image-to-image feature and ControlNet with Stable Diffusion for personalized control over content generation, enabling artistic modification of existing works and painting assistance.

  4. Develop a metaverse platform based on virtual reality technology, with detailed modeling of mainstream artists and modules for conversational exchange and image uploading, where users can freely move around, visit, and interact with virtual artists in a hall-like setting.
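As an illustration of the few-shot approach named in the first objective, the sketch below assembles a persona prompt for one virtual artist from collected facts and a handful of worked question-and-answer pairs. It is a minimal sketch under stated assumptions, not the project's implementation: `ArtistProfile`, `build_few_shot_prompt`, and the prompt layout are hypothetical names and choices, and the resulting string would still need to be sent to whichever language model the platform adopts.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ArtistProfile:
    """Hypothetical container for the facts collected about one artist."""
    name: str
    facts: List[str]


def build_few_shot_prompt(
    profile: ArtistProfile,
    examples: List[Tuple[str, str]],
    question: str,
) -> str:
    """Assemble a persona prompt: facts, worked Q/A pairs, then the new question."""
    lines = [
        f"You are {profile.name}. Answer in the first person, "
        "using only the facts below.",
        "",
        "Facts:",
    ]
    lines += [f"- {fact}" for fact in profile.facts]
    lines.append("")
    # Each worked example shows the model the expected tone and format.
    for example_question, example_answer in examples:
        lines.append(f"Visitor: {example_question}")
        lines.append(f"{profile.name}: {example_answer}")
        lines.append("")
    # The prompt ends on the artist's turn so the model continues from there.
    lines.append(f"Visitor: {question}")
    lines.append(f"{profile.name}:")
    return "\n".join(lines)


van_gogh = ArtistProfile(
    name="Vincent van Gogh",
    facts=[
        "I painted The Starry Night in June 1889.",
        "I was staying at the asylum in Saint-Remy-de-Provence at the time.",
    ],
)
few_shot_examples = [
    ("When did you paint The Starry Night?", "I painted it in June 1889."),
]
prompt = build_few_shot_prompt(
    van_gogh, few_shot_examples, "Where were you living when you painted it?"
)
print(prompt)
```

Keeping the facts in the prompt, rather than relying on what the model memorized during pre-training, is one way to limit inaccurate answers about artworks; fine-tuning on additional data, as the objectives also propose, would complement it.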

The technical approach of the project is divided into the following aspects:

  1. Data Collection and Preprocessing: Collect famous artists’ works, history, and biographies, and process the text using natural language processing techniques for cleaning, segmentation, and vectorization.

  2. Model Training and Optimization: Train and optimize large language models (such as GPT-3) and image generation models (such as Stable Diffusion) so that they can generate artistic works on request and respond naturally to user queries.

  3. Virtualization of Artists: Establish virtual artist models, integrating information about famous artists’ lives, works, and artistic styles into the models for free interaction with users.

  4. Interaction Interface Design: Design a user-friendly interface, allowing users to freely browse and learn about famous artists’ history and works, as well as communicate with virtual artists.

  5. Platform Deployment and Launch: Deploy and launch the developed platform for easy access and use by users.

  6. Continuous Iteration and Optimization: Continuously collect user feedback and data, and keep improving platform functionality and user experience to raise user satisfaction.
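The cleaning, segmentation, and vectorization named in the data preprocessing step can be sketched with the Python standard library alone. This is an illustrative sketch, not the project's pipeline: the function names are hypothetical, the bag-of-words counts stand in for whatever embedding the project ultimately uses, and the final retrieval step (picking the passage closest to a user question) is an assumption about how the vectors might be used.

```python
import math
import re
from collections import Counter
from typing import List


def clean(text: str) -> str:
    """Strip HTML-like tags and collapse runs of whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def segment(text: str) -> List[str]:
    """Split cleaned text into sentences after '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def vectorize(sentence: str) -> Counter:
    """Bag-of-words term counts over lowercase alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_relevant(question: str, passages: List[str]) -> str:
    """Return the passage whose vector is closest to the question's vector."""
    question_vector = vectorize(question)
    return max(passages, key=lambda p: cosine(question_vector, vectorize(p)))


raw = (
    "<p>Vincent van Gogh painted  The Starry Night in 1889.</p>\n"
    "<p>He was staying at an asylum in Saint-Remy-de-Provence.</p>"
)
sentences = segment(clean(raw))
best = most_relevant("When was The Starry Night painted?", sentences)
print(best)
```

A passage retrieved this way could be supplied to the language model as context, so that answers about an artwork draw on the collected material instead of the model's pre-training alone.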